missed opportunity
VERA-MH Concept Paper
Belli, Luca; Bentley, Kate; Alexander, Will; Ward, Emily; Hawrilenko, Matt; Johnston, Kelly; Brown, Mill; Chekroud, Adam
We introduce VERA-MH (Validation of Ethical and Responsible AI in Mental Health), an automated evaluation of the safety of AI chatbots used in mental health contexts, with an initial focus on suicide risk. For this evaluation, practicing clinicians and academic experts developed a rubric informed by best practices for suicide risk management. To fully automate the process, we use two ancillary AI agents. A user-agent model simulates users engaging in a mental health conversation with the chatbot under evaluation, role-playing specific personas with pre-defined risk levels and other features. The simulated conversations are then passed to a judge-agent that scores them against the rubric. The final evaluation of the chatbot being tested is obtained by aggregating the scores of the individual conversations. VERA-MH is under active development and is undergoing rigorous validation by mental health clinicians to ensure that the user-agents act realistically as patients and that the judge-agent scores the AI chatbot accurately. To date, we have conducted a preliminary evaluation of GPT-5, Claude Opus, and Claude Sonnet using initial versions of the VERA-MH rubric, and we have used the findings to guide further design development. Next steps include more robust clinical validation and iteration, as well as refining the scoring so that it is actionable. We are seeking feedback from the community on both the technical and clinical aspects of our evaluation.
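The pipeline described in the abstract (a user-agent simulates a conversation, a judge-agent scores it, and the scores are aggregated) can be sketched roughly as follows. This is a minimal illustration, not the VERA-MH implementation: every name here (`Persona`, `simulate_conversation`, `evaluate`, the stub agents) is hypothetical, the agents are plain Python callables standing in for model calls, the toy judge is not the clinical rubric, and averaging is only one assumed way to aggregate, since the abstract does not specify the method.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List, Tuple

# A turn is (speaker, text). All names in this sketch are hypothetical.
Turn = Tuple[str, str]


@dataclass
class Persona:
    name: str
    risk_level: str  # assumed labels such as "low" or "high"


def simulate_conversation(
    persona: Persona,
    user_agent: Callable[[Persona, List[Turn]], str],
    chatbot: Callable[[List[Turn]], str],
    n_turns: int = 3,
) -> List[Turn]:
    """Alternate user-agent and chatbot messages for n_turns exchanges."""
    transcript: List[Turn] = []
    for _ in range(n_turns):
        transcript.append(("user", user_agent(persona, transcript)))
        transcript.append(("chatbot", chatbot(transcript)))
    return transcript


def evaluate(
    personas: List[Persona],
    user_agent: Callable[[Persona, List[Turn]], str],
    chatbot: Callable[[List[Turn]], str],
    judge: Callable[[List[Turn]], float],
    n_turns: int = 3,
) -> float:
    """Score one conversation per persona and aggregate by the mean (an assumption)."""
    scores = [
        judge(simulate_conversation(p, user_agent, chatbot, n_turns))
        for p in personas
    ]
    return mean(scores)


# Stub agents so the sketch runs without any model API.
def stub_user_agent(persona: Persona, transcript: List[Turn]) -> str:
    return f"[{persona.name}, {persona.risk_level} risk] message {len(transcript) // 2 + 1}"


def stub_chatbot(transcript: List[Turn]) -> str:
    return "I'm here to listen. Are you safe right now?"


def stub_judge(transcript: List[Turn]) -> float:
    # Toy stand-in for a rubric: 1.0 if any chatbot turn asks about safety.
    asked = any(s == "chatbot" and "safe" in t.lower() for s, t in transcript)
    return 1.0 if asked else 0.0


personas = [Persona("A", "low"), Persona("B", "high")]
overall = evaluate(personas, stub_user_agent, stub_chatbot, stub_judge)
```

In a real setup each callable would wrap a model call, and the judge would return per-criterion rubric scores rather than a single number.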
Biggest Surprises (and Missed Opportunities) of the E3 Press Conferences
It's Tuesday, which means the E3 show floor is now open. It also means we're finally at the end of a four-day slog of press conferences from some of the gaming world's largest publishers. While Activision Blizzard still doesn't do its own pre-E3 event, just about everyone else does, which means these 96 hours have been a deluge of announcements and reveals that we did our best to get our arms around. We didn't even cover them all: the Square Enix press conference was basically devoid of new information, and the PC Gaming Show, while compelling, was mostly a long list of indie game announcements--some of which we'll be getting to later this week. So, for now, here's everything you need to know about every press conference you need to know about.